MacLochlainns Weblog

Michael McLaughlin’s Technical Blog

Site Admin

Archive for the ‘Excel 2008’ Category

Excel doesn’t implicity cast

without comments

It’s always interesting troubleshooting things that students run into when they work with Excel. For example, in Excel when is one not equal to one, or for that matter when is one less than one? If you use the CODE(cell_reference) function, you’d find both cells contain the same ASCII value of 49.

ExcelCellComp

If you’re wondering why the one on the left is less than the one on the right, the one on the right is a text value and the one on the left is a number. If you use the TYPE(cell_reference) function, you’d find that a text field returns a 2 and a number (or date) cell returns a 1. All equality comparisons are identity comparisons in Microsoft Excel, which means they compare type and value. Inequality comparisons, compare type and return a value based on the type number and never compare value when the types differ. There are three other types: (1) A logical value is 4; (2) An error message is 16; and (3) An array is 64.

The answer is when one cell contains a number and the other cell contains a string (text value). That’s done here by putting an apostrophe before the number and right aligning the cell. You solve this by using the VALUE(cell_reference) function to ensure that you’re comparing values not unknown data types because there’s no implicit casting resolution in Microsoft Excel and all comparisons are identity operations.

ExcelValueComp

I haven’t checked Open Office or Numbers, but I’ll do that and update this posting with findings.

Written by maclochlainn

November 10th, 2009 at 7:39 pm

Posted in Excel 2007, Excel 2008

Excel date conversion

with 2 comments

I put together a post on how to upload to MySQL from a CSV file with dates. It was more or less for my students but one of them was curious how the mega formula worked. As commented, the easier solution is to create a custom format. Oddly, Open Office does support the MySQL default format natively.

Excel doesn’t support the native MySQL date format, which is YYYYMMDD, or 20090602 for June 2, 2009. That means you have to convert it from a scalar date to a string. If you just shook your head at the term scalar date, let me explain that Excel supports only three data types. They’re a string literal, a numeric literal, and a formula. Dates in Excel are merely formatted numbers. When the numbers are integers, the date is a date, but when the number has a fractional component, the date is really a timestamp.

Here’s a brief description of the process required to convert a date in Excel into a MySQL date format string literal in a CSV file. You need the following Excel functions:

Date Functions

  • The DAY(date) function returns a 1 or 2 digit numeric value for the day of the month, with ranges of 1 to 28, 1 to 29, 1 to 30, or 1 to 31 dependent on the month and year.
  • The MONTH(date) function returns a 1 or 2 digit numeric value for the month of the year.
  • The YEAR(date) function returns a 4 digit numeric value for the year.

Logical Functions

  • The IF(logical_expression,truth_action,false_action) function returns the truth action when the expression is true, and the false action when the expression isn’t true.

MySQL Server

  • CONCATENATE(string_1, string_2, ...) glues strings together.
  • LEN(numeric_value) function returns the length of a string or number.

MySQL requires that you return an eight character string of numbers. The first four numbers must be a valid year, the fifth and sixth numbers a valid month, and the seventh and eigth numbers a valid day in the context of the year and month provided. Unfortunately, the DAY() and MONTH() functions may return a 1 or 2 digit value. That can’t happen in the CSV file’s string for a date, so you use the IF() and LEN() functions to guarantee a 2 digit return value.

Here are the examples that guarantee 2 digit day and month values, assuming that the base date is in the A1 cell. The concatenation of a "0" (zero between two double quotes) or the "" (two double quotes or a string null) ensures the number data types become strings.

=IF(LEN(DAY(A1)) = 1,CONCATENATE("0",DAY(A1)),CONCATENATE("",DAY(A1)))
=IF(LEN(MONTH(A1)) = 1,CONCATENATE("0",MONTH(A1)),MONTH(A1))

A zero is placed before the day or month when the logical condition is met, which means the day or month value is a single digit string. A null is place before the day or month when the logical condition isn’t met, which means the day or month value is a two digit string.
There’s only one problem with these mega functions. They return a number.

The year calculation doesn’t require the explicit casting when you concatenate it with the other strings because it is implicity cast as a string. However, it’s a better practice to include it for clarity (most folks don’t know about the implicit casting behaviors in Excel).

=CONCATENATE(CONCATENATE("",YEAR(A1)),IF(LEN(MONTH(A1)),CONCATENATE("0",MONTH(A1))),IF(LEN(DAY(A1)) = 1,CONCATENATE("0",DAY(A1)),CONCATENATE("",DAY(A1))))

You can see the full MySQL import from CSV in the previous post. Naturally, you may want to copy and paste special the value before creating the CSV file. Also, don’t forget to delete any unused columns to the right or rows beneath because if you don’t your file won’t map to your table definition.

Written by maclochlainn

June 16th, 2009 at 8:15 pm

MySQL Upload from CSV

without comments

You have one approach to uploading CSV file data into a MySQL database. You use the LOAD statement. There two caveats about using the LOAD command, and they are:

  • You must convert dates to the default MySQL format – YYYYMMDD, or the four digit year, two digit month, and two digit day.
  • You can’t use AUTO_INCREMENT from the source to the loading table.

It’s untrue that you must input dates as strings and convert them in the database. However, you can do that if you like. I’d suggest you open the file in Microsoft Excel, then create a column with the correct data format. You can do that by creating a custom format, or with a formula. I figure the custom format is easiest but sometimes a formula is necessary for applications.

You should note that loading a date at the end of a line has problems in MySQL. The easiest fix that I’ve found is to place a comma at the end of each line. In a Microsoft world, that eliminates the need for the \r from the LINES TERMINATED BY clause.

While this shows a trivial amount of data, here’s a CSV file to use in the test case. Create a directory like Data off your base logical drive or mount point. Then create a file named transaction_upload.csv and put the following data set in it.

20090102,5
20091231,5
20091128,5
20090616,5

Create an importing.sql file in the C:\Data folder with the contents below. It’s only a recommendation but generally import tables aren’t useful in between uploads, and that’s why I create the table as an in-memory table. In-memory tables are dropped when you recycle (boot) the MySQL instance.

-- Conditionally drop the table.
DROP TABLE IF EXISTS transaction_upload;
 
-- Create the new upload target table.
CREATE TABLE transaction_upload
( transaction_date   DATE
, transaction_amount FLOAT ) ENGINE=MEMORY;
 
-- Load the data from a file, don't forget the \n after the \r on Windows or it won't work.
LOAD DATA INFILE 'c:/Data/mysql/transaction_upload.csv'
INTO TABLE transaction_upload
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\r\n';
 
-- Select the uploaded records.
SELECT * FROM transaction_upload;

Before you connect as anything other than the root account, you’ll need to grant the global FILE privilege to the targeted user. This is true because you’ve not qualified a LOCAL file in the import statement. Users that have all privileges on a given database also have read-write access to LOCAL tables, which is read write to client-side files. When the LOCAL key word is omitted, you’re instructing a read from the server by the client tool. You must therefore grant a global permission to enable a connected user can call a server-side file.

If you want to import without granting any additional global permissions, you can rewrite the statement like this:

LOAD DATA LOCAL INFILE 'c:/Data/mysql/transaction_upload.csv'
INTO TABLE transaction_upload
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\r\n';

If you don’t want to rewrite the statement, you’ll get the following error:

ERROR 1045 (28000): Access denied FOR user 'student'@'localhost' (USING password: YES)
Empty SET (0.01 sec)

Run the script:

mysql> \. importing.sql

Then, you should see the following:

+------------------+--------------------+
| transaction_date | transaction_amount |
+------------------+--------------------+
| 2009-01-02       |                  5 |
| 2009-12-31       |                  5 |
| 2009-11-28       |                  5 |
| 2009-06-16       |                  5 |
+------------------+--------------------+
4 rows in set (0.00 sec)

Another quick tidbit, dates can be tricky when they’re the last column in the file. The best solution is to put a comma at the end of each string. If you put the comma at the end of each line, you can also dispense with the \r on the Windows platform.

Written by maclochlainn

June 16th, 2009 at 6:42 pm

How to LPAD in Excel

with 9 comments

Somebody tweeted tonight and asked how to left pad a string in Microsoft Excel 2007. I thought it was an interesting question, and replied. Then, I thought oops … it should be on the blog.

You can use a combination of the REPLACE and REPT (believe it or not that’s the repeat function). The first step to left pad requires you to master the REPLACE function. It takes four actual parameters. The first is the old or base string, which is typically a cell reference. The second is the starting position, the third is the length of the base string to replace, and fourth the new string. The following example assumes you want to replace the whole string with a white space before the original string.

=REPLACE(A1,1,LEN(A1)," "&A1)

If you don’t like using the ampersand to concatenate, you can replace it with the CONCATENATE function, like this:

=REPLACE(A1,1,LEN(A1),CONCATENATE(" ",A1))

While more compact, the ampersand does the trick and ultimately saves space for complex mega formulas. The next step requires you learn how to use the REPT function.

The REPT function takes two parameters, which are the string you want to repeat and the number of times you want to repeat it. Assuming that you expect all strings to be smaller than twenty, you can use the REPT function this way to create a dynamic padding of physical white space.

=REPT(" ",20-LEN(A1))

When you put the concepts together, you’d get the following command to dynamically replace a string in cell A1 with a 20 character cell that is right aligned. The new value would be in the cell where you put this formula.

=REPLACE(A1,1,LEN(A1),REPT(" ",20-LEN(A1))&A1)

This lets you left pad a string, which I suppose is great when you want to output a flat position specific file. I’d prefer a CSV and a more robust programming environment but I hope it helps those who want to do it.

As Andy pointed out, you can simplify it with this:

=REPT(" ",20-LEN(A1))&A1

Written by maclochlainn

March 4th, 2009 at 12:17 am

Posted in Excel 2007, Excel 2008

Excel string parsing

with 3 comments

Parsing strings isn’t the easiest thing to do in Excel 2007 or Excel 2008 but it is an important thing to know how to do. You’ll learn how to parse a set names into first, middle and last names. You can find the data set for these examples at the bottom of the blog post.

Written by maclochlainn

February 27th, 2009 at 10:15 pm

Excel 2007 AND() function

without comments

Since Excel 2007 and 2008 don’t support a direct exclusive or=XOR() – function, you’d think the logical operators could do it. Something like this:

=AND(NOT(AND(A1,B1)),OR(A1,B1))

It surprised me when it didn’t work. Curious, I figured out why it wouldn’t work. The AND() function doesn't work as qualified in the help file.

excellogicdefinitions

The AND() function only works correctly when both cell values are not null values. It returns a Boolean true when one value is true and the other value is null, which was quite a surprise.

Fortunately, the IF() function works correctly. If you check a cell with a null value, the IF() function finds it false. That means you can wrap the cells in the IF() function, like:

=AND(NOT(AND(IF(A1,TRUE(),FALSE()),IF(B1,TRUE(),FALSE()))),OR(IF(A1,TRUE(),FALSE()),IF(B1,TRUE(),FALSE())))

Naturally, you can substitute the respective string literal for either the TRUE() or FALSE() function calls. You could also rewrite the formula to check for a null value by using two double quotes. By the way, it's fun to note that the TYPE() function recognizes a null value cell as a number data type.

It's ashame it doesn't work the way you'd think it should.

Written by maclochlainn

February 15th, 2009 at 9:48 pm

Posted in Excel 2007, Excel 2008

Add Image to Cell Comment

with one comment

How you add an image to a comment isn’t exactly what I’d call easy but it is straightforward. One of my students noted that a tip in Microsoft Office Excel 2007 Bible didn’t provide the magic to insert an image.

I’d have to say that it did but it didn’t. Basically, the general idea is there but a step of magic requires something special. You need to place your cursor on the highlighted edge of an editable comment before right clicking to launch the context menu. If your cursor is on the text portion of the comment, the context menu operates differently. The Format Comment … menu choice launches a different set of options dependent on your location in an editable comment.

Hoping to help my students and others learn the magic, here are some steps.

Written by maclochlainn

January 29th, 2009 at 1:04 pm

Posted in Excel 2007, Excel 2008

Excel 2008 missing more of Excel 2007

without comments

I remember when the Mac version of the office suite was like the Windows version. It’s so disappointing to find more shortfalls between the two. Here is a dialog that appears in Excel 2008 when you try to open an Excel 2007 workbook that includes the powerful new table components:

A year after the release of Excel 2007, and Microsoft didn’t include structured references, Excel tables, or table styles … What were they thinking about up in Redmond? Were they thinking that they should release something less on a Mac to sell Windows Vista? It couldn’t be that, could it?

Written by maclochlainn

December 4th, 2008 at 11:58 pm

Dates unequal between Excel 2007 & 2008

with one comment

I was doing some cross product testing between Excel 2007 (PC) and Excel 2008 (Mac) and got a big surprise! The number representing a date differs by 1,462. Excel 2007 starts using integers at 1 equal to 1-Jan-1900, but Excel 2008 starts with 0 equal to 1-Jan-1904. It clearly means that dates should be uploaded as formatted strings rather than numbers when loading them into a database. That also appears to mean any shared computations linked to dates across a collection of PC and Mac computers fails unless you account for the difference. I’ve updated that earlier post on what’s true with the details.

Written by maclochlainn

November 25th, 2008 at 6:00 pm

Posted in Excel 2007, Excel 2008

What’s true in an Excel IF statement?

with one comment

As I occasionally teach an overload freshman class on Microsoft Excel, it has become very interesting to observe how students perceive the truth of a statement. They generally get the idea that an expression evaluates the equality or inequality of expressions. The idea that value x equals value y (x=y), value x is less than value y (x>y) makes sense but when cell coordinates are substituted it can be a bumpy ride to complete understanding.

The idea of relative cell coordinates (or references) versus absolute cell coordinates takes a bit for some. Raising the ante with partially relative, partially absolute, or as most Microsoft Excel books label them mixed references becomes a challenge that is best overcome by experimentation. I label that type of learning as the Mario Brother’s paradigm, success at level 5 is generally preceded by success at level 4.

A nasty twist though is that Microsoft Excel doesn’t limit truth to the results of expressions. A number or date (a date is only a number with a format mask) is always true provided it isn’t equal to zero. Also, the strings TRUE and FALSE are respectively true or false, as if they were written as the =TRUE() or =FALSE() functions. Any other string value returns a #VALUE! expression. Microsoft Excel does this because it applies a weakly typed programming language rule (similar to Perl or PHP), which assumes any positive or negative number is true, while a zero value is false. This really opens up the use of the =IF(conditional_expression, true_outcome, false_outcome) function to advanced users but can be a stunner to new ones. By the way, Microsoft Excel labels the conditional expression as a logical test but the problem is that the rules governing logical tests generally don’t make sense to non-programmers.

Dates formatting is a bit quirky in Microsoft Excel. It would be wonderful if they’d published rules, noted exceptions, and such. My favorite quirk is that you can’t apply a format mask to any number in the range of (((2^16)*2)-70) to (((2^16)*2)+51) in Microsoft Excel 2007. You can format that range in Microsoft Excel 2008 on the Mac. Beyond that oddity, date numbering starts with 0 being equal to 1-Jan-04 in Excel 2008, while 0 formats to 0-Jan-1900 in Excel 2007. Negative numbers also format in Excel 2008 with a negative sign pre-pending the date.

The mystery question is: Where’s the next gap in Excel 2007? As you can see in the image, negative numbers can’t be date formatted in that version. However, you can evaluate whether they’re true or false notwithstanding the formatting error – those #’s. If you use the =DATE(2258,9,1) you create a date in the missing range – amazing.

Obviously, the Microsoft Engineers have a hack of some kind in place in Excel 2007 because a typical unsigned short is ((2^16)*2), while the =DATE(2258,9,1) function works in the offending range. Perhaps the DATE() function works because it uses an integer. I’d bet there’s another gap too. Gaps are shown in the next image.

As noted by Laurent in his comment, 0-Jan-1900 is false on Excel 2007, which makes sense because it’s an invalid date. Excel 2008 also treats a zero as false, but zero is a valid 1-Jan-1904 date. This means the numeric value of a date differs by 1,462 between the two versions of Excel. That makes dates non-portable between Windows and Mac version of the same product – Wowie!

It is possible to fix Excel 2007 to work with Excel 2008 files. All you need to do is click on the Options button, and then choose the Advanced selection. Inside the Advanced selectoin, you’ll find a When calculating this workbook section. Simply click the Use 1904 date system checkbox to enable your Excel 2007 to work like Excel 2008. Unfortunately, I can’t find an equivalent to set Excel 2008 to work like Excel 2007.

exceldateoption

You enter formulas by typing them in the cell or clicking the insert formula button. The button launches a wizard that lets you walk through the basics of entering a formula. An =IF() function looks like this in the formula bar:

excelifformulabar

Cases for entering a logical expression vary. There are some simple rules. They are:

  • Numbers can be compared directly.
  • Text must be enclosed by double quotes inside logical expressions.
  • Formulas are entered without prefacing equal symbols (=).

The following show you how to create some basic logical expressions inside the function wizard for an =IF() function.

1. The equality of numbers, which returns the true statement:

excelif1

2. The non-equality of numbers, which returns the false statement:

excelif2

3. The relation between text. Text is compared based on the ASCII values of the characters in the strings. Capital A maps to 65, capital B maps to 66, et cetera, and lowercase a maps to 91, lowercase b maps to 92, et cetera. Notice that the text “one” is not greater than “only” because the ASCII value of an “e” is lower than the ASCII value of an “l”. The comparison stops with the first letter that differs.

excelif3

The next teaching point is the idea or when two things must be true, or when at least one thing of two things must be true for an expression to be true. It introduces the students to basic truth statements from Philosophy 101 (which they may not have taken). These are implemented as the =AND() function and =OR() functions, which are illustrated in the truth tables below.

When they understand what’s true and when it is true, they can master any combination of truth or non-truth. Until the concepts are clear it seems impossible, or at least very difficult, for end-users to understand or use the pre-built functions in Excel 2007 or Excel 2008.

I’d love to see cool learning vehicles, Flash widgets and such that I could leverage in the classroom. If you know of any, I’d appreciate a comment directing me and other readers there.

Written by maclochlainn

November 24th, 2008 at 12:28 am

Posted in Excel 2007, Excel 2008