MacLochlainns Weblog

Michael McLaughlin's Technical Blog

Site Admin

Archive for September, 2010

Load XML Local Infile

with 9 comments

Having downloaded and tested some basics of the MySQL 5.5 Release Candidate, I started checking out the new features. While testing the new LOAD XML LOCAL INFILE feature, I discovered that there are restrictions governing the configuration of source XML files.

You must restrict the XML file to a list of tag names that correspond to column names within a tag defined as <row>. The tag names are case sensitive to your column names. You can replace the <row> tag name with any name of your choosing provided you append a clause that maps rows to your substitution XML tag name.

You can’t convert a file that has multiple child XML tags with the same name. Any attempt simply loads the last tag name found in the row hierarchy. Therefore, you should ensure that all source files have a unique list of case-sensitive child tags that map to the column definitions of the import table.

Either of the following table definition provides for lowercase column names. The first one uses nothing to delimit the column names.

CREATE TABLE CHARACTER
( ROLE CHAR(30) NOT NULL
, actor CHAR(30) NOT NULL
, part CHAR(20) NOT NULL
, film CHAR(100) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

If you run SHOW CREATE TABLE character; you would see the more conventional definition below.

CREATE TABLE `character`
( `role` CHAR(30) NOT NULL
, `actor` CHAR(30) NOT NULL
, `part` CHAR(20) NOT NULL
, `film` CHAR(100) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

The following source file format supports the table definition because the XML tags are lowercase names. The source file wouldn’t work if the column names where uppercase or mixed case names.

<row>
  <role>Indiana Jones</role>
  <actor>Harrison Ford</actor>
  <part>protagonist</part>
  <film>Indiana Jones and Raiders of the Lost Ark</film>
  <film>Indiana Jones and the Temple of Doom</film>
  <film>Indiana Jones and the Last Crusade</film>
</row>

When the XML tags and column names match in case, you can load the file with the following syntax:

LOAD XML LOCAL FILE 'C:/Data/XML/character.html'
INTO TABLE `character`;

If you name the <row> tag <character>, you need to add a clause as noted below.

LOAD XML LOCAL FILE 'C:/Data/XML/character.html'
INTO TABLE `character`
ROWS IDENTIFIED BY '<character>';

If you have two film tags in a row tag, MySQL 5.5 doesn’t raise an error or warning. It simply loads the last <film> value. If you create tables with column names that don’t match on a case-sensitive basis, you’ll receive a 1263 warning message. You can see the warning message with the following command.

SHOW warnings;

The warning message only occurs when a column is not null constrained and the column name fails to match an XML tag attribute in the source file. No error or warning is raised when a column isn’t not null constrained under the same scenario. You can test it and then show warnings.

You should see something like this:

+---------+------+---------------------------------------------------------------------------------+
| Level   | Code | Message                                                                         |
+---------+------+---------------------------------------------------------------------------------+
| Warning | 1263 | COLUMN SET TO DEFAULT VALUE; NULL supplied TO NOT NULL COLUMN 'Role' at ROW 1   |
| Warning | 1263 | COLUMN SET TO DEFAULT VALUE; NULL supplied TO NOT NULL COLUMN 'Actor' at ROW 1  |
| Warning | 1263 | COLUMN SET TO DEFAULT VALUE; NULL supplied TO NOT NULL COLUMN 'Part' at ROW 1   |
| Warning | 1263 | COLUMN SET TO DEFAULT VALUE; NULL supplied TO NOT NULL COLUMN 'Film' at ROW 1   |
+---------+------+---------------------------------------------------------------------------------+

When you define a table with the AUTO_INCREMENT set to a value other than 1, the LOAD XML LOCAL FILE command resets the sequence to 1 before loading. I’d recommend you import into a table without an auto incrementing column and then use the REPLACE INTO to set the surrogate key values of auto incremented columns.

Written by maclochlainn

September 26th, 2010 at 1:56 am

Posted in MySQL,xml

OOW2010 – Day 4

without comments

The last day of Oracle Open World 2010. My focus today was on attending OracleDevelop and JavaOne.

The weather picked up today and it was a nice warm Indian Summer day. There were lots of tourists out too. The photo is taken of the building where you board the Powell Street cable car. There was a lot of walking with the way the events are dispersed among the Moscone South, Moscone West, Marriott Hotel, Westin Hotel, Hilton Hotel, W Hotel, and Hotel Nikko. I can’t quite remember how many times I walked back and forth across the 6 blocks between the Moscone centers and Union Square hotels. I can tell you that we went only twice to Mel’s Drive-in.

It was amazing to see how quickly the various conference expedition centers shutdown, packed up, and had their materials shipped out. We had to step over all the plywood that protected tiles and carpets to attend events.

Day 4 also brought a smaller audience for venues. I’m not quite sure if they left earlier or slept in because they were out too late last night attending the Wednesday night event on Treasure Island. The reduced number of attendees was great for those of us who remained. You can see how few attended the .NET/Oracle hands-on lab in the Hilton, which made finding a nice spot easy. You can find the .NET/Oracle hands-on tutorial materials on the Oracle Technical site.

If you opt to use the tutorials, you may benefit from these hints. You should be able to avoid some of the issues that I ran into when working through the open labs. First, you should expand the Microsoft Studio to full screen. Second, you should look for context pop-ups attached to a small arrow at the top right corner of grids, et cetera. Lastly, there are a few small mistakes that you’ll need to work through. Look at the errors as an opportunity to think and experiment and they’re great basic .NET/Oracle tutorials.

Oracle Open World 2010 is done. Time to review the keynotes for those things that I missed while listening to them, and consider the new role of Tuxedo in the life of Oracle’s product stack. It’s also time to download and play with the MySQL 5.5 candidate release; and it’s time to kick off my shoes, put up my feet, and play with the technology again.

Tomorrow I turn my fate over to the airlines, and hope to arrive home on schedule.

Written by maclochlainn

September 24th, 2010 at 1:25 am

Posted in .NET,MySQL,Oracle

OOW2010 – Day 3

without comments

Oracle Press authors met at the bookstore to sign books for an hour this morning and visit with our editors (shown at left). Then, we went off to the OTN Lounge and JavaOne at the Hilton.

It’s interesting to see different Java solution sets, especially on embedded devices. I also got a good look at the Vaadin server-side AJAX framework. It was interesting. The QuickConnect AJAX framework isn’t represented at the conference, but it’s a free open source project that may interest some.

Java’s dominance in the Oracle product stack is clear. Java is the crux of the SOA architecture for Oracle’s middleware solutions. Interestingly, I’ve met a number who are new to Java. A number of the vendors here are looking for skilled Java programmers, which leads me to see a lot of opportunity for developers.

I snagged a copy of Oracle Essbase and Oracle OLAP today. Oracle Essbase is part of Oracle Fusion Middleware 11g. I’ve began reading the book this evening. It appears a good starting place for those exploring Oracle Essbase and Oracle OLAP solution space.

Written by maclochlainn

September 23rd, 2010 at 12:14 am

OOW2010 – Day 2

without comments

This was a day with a busy schedule because of the publisher’s seminar. It’s where the authors, like me, learn about Oracle products and market focus. Andy Mendolsohn went over the Oracle Database 11g R2, and other VPs presented BI/ERP solutions and the effect of Oracle’s Essbase Plus (previously known as Hyperion), Java’s position and direction, Solaris’ position and direction, and the Fusion Applications. A summary of the highlights I noted follows:

  • The Oracle Database 11g R2 presentation explained the idea of quarterly patch set updates (PSUs), advanced compression, the new Oracle Database Firewall, OCFS (Oracle Clustered File System), and the Exadata Server. Three quarterly PSUs, then a point release is a change but a welcome one.
  • Advanced compression qualified three subgroups: (a) A 3 times compression for OLTP systems; (b) A 10 times compression for data warehousing, and (c) a 15 to 50 times compression for archive data. If we apply this with the capital cost figure of 40% for storage as valid, compression may substantially reduce costs.
  • They shared Gartner statistics that 82% of Fortune 500 companies use Oracle BI/ERP solutions was very interesting. The idea that the largest customer deploys an Oracle BI/ERP solution to 1 million users turns the demarcation between traditional data warehouse explorers and farmers into a historical footnote. They positioned Oracle BI EE Plus as targeted to the development of dashboard and BI Publisher (formally XML Publisher). Oracle Essbase Plus for Model-OLAP (Online Analytical Processing – focused on resolving the discrepancies between R-OLAP (Relational-OLAP) and M-OLAP (Model-OLAP). Oracle Scorecard and Strategy Management tools are reserved to planning Long-Range Objectivs (LROs), Short-Range Objectives (SROs), and Key Indicators (KPIs). The last requires a carefully modeling of the objective needs of the business and data model capability.
  • The Exadata server now comes in two types, the X2-2 and X2-8. The first may be a quarter, half, or full Exadata server while the latter is always a full Exadata server. The X2- has 128 cores, 2 TB of memory, a Linux/Solaris 10 GB Ethernet connection, 14 storage servers with 6 core CPU in each storage server.
  • Oracle’s commitment to Java is very solid but a question on NetBeans leaves one to believe that it isn’t part of Oracle’s solution space and it may not see much if any evolutionary change. They said that NetBeans would be supported to help external Open Source developers. My guess is that we should plan to migrate NetBeans or bear the cost of owning solutions with a higher integration cost.
  • Solaris has come through the merger with new strength and a commitment to four goals that I caught: availability, scalability, and efficiency, and security. They also committed to make Solaris fully virtualizable, which will more effectively support private clouds. You may also note that Oracle has improved efficiency of MySQL 5.5 on the Solaris platform.
  • Fusion Middleware Applications focus on: (a) Best Practice – leveraging the best in class of applications from the Oracle eBusiness Suite, PeopleSoft, JD Edwards, and Siebel; (b) Re-inventing User Experience – A role-based user interface, embedded decision support, and pervasive collaboration; (c) Cloud – Support for both private and public clouds; (d) Platform – Standard-based, configurable, adaptive, and secure Fusion middleware. Customers will have three options for the future. They may continue the current path and upgrade eBusiness Suite to eBusiness Suite, et cetera. Alternatively, they can adopt a co-existence strategy that lets them mix and match components form multiple suites. Lastly, they may embrace the complete suite as a whole.

After the Publisher’s Seminar, we went to the Nikko for Oracle Develop. On the way, we had to cross Powell Street. I snapped a photo of John Harper, co-author of Oracle Database 11g Programming Workbook, with the Cable Car. The Nikko’s door is on the right, and the sessions were on the third floor.

After one session, we went back to the Moscone South Exhibition hall to snag a copy of Guy Harrison’s Oracle Performance Survival Guide: A Systematic Approach to Database Optimization. It’s the fresh version of his prior SQL Tuning books. I’d already bought one two months ago, but I couldn’t resist a signed copy for the office when it was free. I recommend the book as worthwhile and have promised Guy to write a review when I get home.

At the conclusion of the day, John and I went to the Oracle Press party to meet other authors. Ric Niemiec, founder of TUSC and author of Oracle Database 10g Performance Tuning: Tips and Techniques, and we took a photo. He’s also mentioned that the next edition covering Oracle Database 11g will include coverage of the Exadata Server. It’s awesome that he gets to write after the terminal release of the database.

Written by maclochlainn

September 22nd, 2010 at 2:15 am

OOW2010 – Day 1

without comments

I started the day at JavaOne. It’s at the Hilton on Mason Street. I attended a couple sessions on Java. Then, I went to the JavaDEMO Grounds to check things out rather than catch a sandwich.

There were a few interesting venues but I found the JBoss presentation the most useful on Seam 3. The presentation was worth the time, and the demonstration using the technology at the end was effective for a couple reasons. While the presenter’s environment wasn’t ready, he was able to fix it while working through his discussion of the technical stack. Also, the use of Seam 3 was first class. I’d recommend you stop by if you have a chance to see it.

After seeing the JavaDEMO Grounds, I went back to the Marriott to attend the Data Integration hands-on lab with Data Integrator and the Database Performance Diagnostics and Tuning presentations.

Afterward, a friend and I went to Moscone South to the Exhibition Hall. There we found the air hockey table and other game room. Only one here this year. We finished up by hitting the Moscone South Exhibition Hall. Saw some interesting hardware demos and that’s a wrap for Day 1 at Oracle Open World 2010. Actually, my second day because MySQL Sunday counts in my book.

Written by maclochlainn

September 21st, 2010 at 12:53 am

Posted in Java,Oracle

MySQL Sunday

without comments

The merry-go-round is running and Oracle Open World 2010 has begun with MySQL Sunday. The merry-go-round is a fixture at the northwest corner of the Moscone South building, at the intersection shared between the Moscone North building.

The keynote was interesting because Oracle confirmed that they have and will continue to invest in MySQL. The MySQL 5.5 Candidate Release is now available for download. Key features that you’ll note are 200% to 300% improved performance, the InnoDB is now the default engine, backup and recovery are dramatically improved, and the enterprise model is integrated to provide more information about internals and performance.

A little research for the comprehensive new feature list for MySQL 5.5 found a well documented page in the MySQL 5.5 Reference. Examples of things left out of the presentation, not an inclusive list of all features, are: improved operation on the Solaris platform (surprise ;-)), support for semisynchronous replication, support for SQL standard SIGNAL and RESIGNAL statements, support for Unicode character sets, a LOAD XML statement, expanded partitioning options – including the ability to truncate only a partition of a table, and may new command options.

Ronald Bradford gave a good presentation on MySQL idiosyncrasies. I thought spelling out behaviors common to Windows and Mac platforms that differ from Linux and Unix platforms was great. He’s got a new book that he co-authored on MySQL and PHP. I’ve look into the book yet.

I also enjoyed Giuseppe Maxia’s presentation on partitioning. His regular web site is here, but beware if you’ve a slow web connection because the photos delay page loading. 😉

Update on MySQL Sunday

Here are some links provided by Oracle to presentations:

  • Edward Screven’s presentation
  • Live Webinar: “Delivering Scalability and High Availability with MySQL 5.5 Replication Enhancements

Written by maclochlainn

September 19th, 2010 at 2:30 pm

Posted in InnoDB,LAMP,MAMP,MySQL