Archive for March, 2009
How to LPAD in Excel
Somebody tweeted tonight and asked how to left pad a string in Microsoft Excel 2007. I thought it was an interesting question, and replied. Then, I thought oops … it should be on the blog.
You can use a combination of the REPLACE
and REPT
(believe it or not that’s the repeat function). The first step to left pad requires you to master the REPLACE
function. It takes four actual parameters. The first is the old or base string, which is typically a cell reference. The second is the starting position, the third is the length of the base string to replace, and fourth the new string. The following example assumes you want to replace the whole string with a white space before the original string.
=REPLACE(A1,1,LEN(A1)," "&A1) |
If you don’t like using the ampersand to concatenate, you can replace it with the CONCATENATE
function, like this:
=REPLACE(A1,1,LEN(A1),CONCATENATE(" ",A1)) |
While more compact, the ampersand does the trick and ultimately saves space for complex mega formulas. The next step requires you learn how to use the REPT
function.
The REPT
function takes two parameters, which are the string you want to repeat and the number of times you want to repeat it. Assuming that you expect all strings to be smaller than twenty, you can use the REPT
function this way to create a dynamic padding of physical white space.
=REPT(" ",20-LEN(A1)) |
When you put the concepts together, you’d get the following command to dynamically replace a string in cell A1
with a 20 character cell that is right aligned. The new value would be in the cell where you put this formula.
=REPLACE(A1,1,LEN(A1),REPT(" ",20-LEN(A1))&A1) |
This lets you left pad a string, which I suppose is great when you want to output a flat position specific file. I’d prefer a CSV and a more robust programming environment but I hope it helps those who want to do it.
As Andy pointed out, you can simplify it with this:
=REPT(" ",20-LEN(A1))&A1 |
As an update to the original blog post, here’s a Visual Basic for Applications (VBA) User Defined Function (UDF) that would provide you with an add-in function to perform this task. It’s designed to LPAD
numbers or text, and always return text. You can find how to develop and deploy Excel VBA UDFs in this more recent blog post of mine.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 | ' The function takes any input as a cell and manages numbers different than strings. Function LPAD(cell As Variant _ , Optional offset As Variant = 0 _ , Optional padWith As String = "0") ' Return variable. Dim returnString As String Dim whiteSpace As String * 1 ' Reports a meaningful error when a non-numeric offset parameter is provided as the offset call parameter. If Not IsNumeric(offset) Then ' This means the function is called incorrectly but suppresses a #VALUE! error. MsgBox ("Offset should not be a non-numeric value of [" + offset + "]!") ' Dynamically assign the offset for the optional offset variable. ElseIf offset = 0 Then offset = Len(cell) Else ' This allows you to see the offset value when it is provided as a number. ' MsgBox ("offset is [" + CStr(offset) + "]") End If ' Assign default value. whiteSpace = " " ' Ensure size is at least the length of the cell and padding value. If IsNumeric(offset) And Len(cell) > offset Then offset = Len(cell) + Len(padWith) End If ' Assign default padding value when cell is a number. If IsNumeric(cell) And IsNumeric(padWith) Then padNumberWith = CInt(padWith) Else padNumberWith = 0 End If ' Convert to string when numeric, use padWith text value when not null and whitespace if null. If IsNumeric(cell) Then returnString = Application.Rept("0", offset - Len(Application.Text(cell, padNumberWith))) + CStr(cell) ElseIf padWith <> "0" Then returnString = Application.Rept(padWith, offset - Len(cell)) + cell ElseIf padWith = "0" Then returnString = Application.Rept(whiteSpace, offset - Len(cell)) + cell Else returnString = Application.Rept(" ", offset - Len(cell)) + cell End If ' Return formatted string. LPAD = returnString End Function |
The formula for a number is different than it would be natively in Excel. That’s because some behaviors are implicitly provided. The equivalent function in Excel is:
=REPT("0",20-LEN(TEXT(A6,"0")))&A6 |
Microsoft Excel performs the LEN()
function implicitly if you leave it out. For example, this works the same as the more complete formula above.
=REPT("0",20-(TEXT(A6,"0"))&A6 |
If you forget the LEN()
function call in the VBA module, it raises an Excel Error 20150
. This error is suppressed by a #VALUE!
error. That error is returned because an error message can’t be concatenated with a string. Once you identify where the error occurs you can enclose it in a CStr()
function call. The CStr()
function explicitly casts the Err.number
value to a string, which is then returned to the worksheet in the cell where you’ve called the function.
All that being said, it’s not nearly that complicated. You can provide the LPAD
UDF in a simpler function that doesn’t use the built-in workbook functions. All you need to know is how to us the String()
function, like this:
Function LPAD(padding AS Variant, character AS String, cell As String) ' Define and assign a default value, which must be a string data type. Dim character_default As String * 1 character_default = " " Dim number_default As String * 1 number_default = "0" ' Check if you've got adequate values to proceed. If IsNumeric(padding) And IsNumeric(cell) Then If IsNumeric(character) Then LPAD = String(padding, character) + cell Else LPAD = String(padding, number_default) + cell End If Else If IsNumeric(character) Then LPAD = String(padding, character) + cell Else LPAD = String(padding, character_default) + cell End If End If End Function |
You would call this UDF with any of the following function calls. You skip the last parameter because its an optional parameter.
User Defined Function Calls | |
---|---|
Cell | Formula |
A2 | =LPAD(A1,10) |
A3 | =LPAD(A1,30,"-") |
A4 | =LPAD(A1,20,0) |
Basic SQL Query
Somebody suggested that I post a simple quick view of how a basic query works. The following illustrates how the FROM
is read first. Aliases assigned in the FROM
clause, like the i
, replace the full name of the table or view. Column references don’t require prepending with the table or view alias but doing so adds clarity in the query. You do need prepend table aliases or names when two or more columns returned by the query have the same names. This happens when you’re joining two or more tables because the SQL parser finds that they’re ambiguously defined otherwise.
While the column aliases use the optional AS
keyword, table aliases must directly follow the table name. Table name really means a table, view, or inline view name. They can also mean a subquery factoring clause, which is the fancy name for a WITH
statement – blogged on it here.
Selection comes in two phases, identifying the sources or tables in the FROM
clause, and then filtering the sources based on comparisons in the WHERE
clause. Join statements are also filters that match rows from different tables based on value or range comparisons. You find join statements in the FROM
clause as part of a ON
or USING
subclause when queries use key words like JOIN
et cetera. You find join statements in the WHERE
clause when the queries list tables as comma separated elements in a FROM
clause.
Projection is the narrowing of rows into columns qualified by the select list. A select list is the comma separated columns returned by a query in the SELECT
clause.
While database management systems have their own particulars about sequencing and optimization, more or less they find the data sources, read the rows or indexes to rows, and then narrow the columns returned to those qualified in a select list. That’s about it unless a query involves aggregation or sorting operations.
Aggregation typically happens after selection but before projection. While columns in the SELECT
clause often set the aggregation grouping, you may use columns other than those in the select list. You qualify the grouping columns in the optional GROUP BY
clause.
Sorting by a column is done through the ORDER BY
clause. The sorting of data follows the selection process, unless there is an aggregation process. Sorting follows aggregation when it is present in a query. Aggregated data sets are limited to ordering by columns in the GROUP BY
clause.