MacLochlainns Weblog

Michael McLaughlin's Technical Blog

Site Admin

Load XML Local Infile

with 9 comments

Having downloaded and tested some basics of the MySQL 5.5 Release Candidate, I started checking out the new features. While testing the new LOAD XML LOCAL INFILE feature, I discovered that there are restrictions governing the configuration of source XML files.

You must restrict the XML file to a list of tag names that correspond to column names within a tag defined as <row>. The tag names are case sensitive to your column names. You can replace the <row> tag name with any name of your choosing provided you append a clause that maps rows to your substitution XML tag name.

You can’t convert a file that has multiple child XML tags with the same name. Any attempt simply loads the last tag name found in the row hierarchy. Therefore, you should ensure that all source files have a unique list of case-sensitive child tags that map to the column definitions of the import table.

Either of the following table definition provides for lowercase column names. The first one uses nothing to delimit the column names.

CREATE TABLE CHARACTER
( ROLE CHAR(30) NOT NULL
, actor CHAR(30) NOT NULL
, part CHAR(20) NOT NULL
, film CHAR(100) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

If you run SHOW CREATE TABLE character; you would see the more conventional definition below.

CREATE TABLE `character`
( `role` CHAR(30) NOT NULL
, `actor` CHAR(30) NOT NULL
, `part` CHAR(20) NOT NULL
, `film` CHAR(100) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

The following source file format supports the table definition because the XML tags are lowercase names. The source file wouldn’t work if the column names where uppercase or mixed case names.

<row>
  <role>Indiana Jones</role>
  <actor>Harrison Ford</actor>
  <part>protagonist</part>
  <film>Indiana Jones and Raiders of the Lost Ark</film>
  <film>Indiana Jones and the Temple of Doom</film>
  <film>Indiana Jones and the Last Crusade</film>
</row>

When the XML tags and column names match in case, you can load the file with the following syntax:

LOAD XML LOCAL FILE 'C:/Data/XML/character.html'
INTO TABLE `character`;

If you name the <row> tag <character>, you need to add a clause as noted below.

LOAD XML LOCAL FILE 'C:/Data/XML/character.html'
INTO TABLE `character`
ROWS IDENTIFIED BY '<character>';

If you have two film tags in a row tag, MySQL 5.5 doesn’t raise an error or warning. It simply loads the last <film> value. If you create tables with column names that don’t match on a case-sensitive basis, you’ll receive a 1263 warning message. You can see the warning message with the following command.

SHOW warnings;

The warning message only occurs when a column is not null constrained and the column name fails to match an XML tag attribute in the source file. No error or warning is raised when a column isn’t not null constrained under the same scenario. You can test it and then show warnings.

You should see something like this:

+---------+------+---------------------------------------------------------------------------------+
| Level   | Code | Message                                                                         |
+---------+------+---------------------------------------------------------------------------------+
| Warning | 1263 | COLUMN SET TO DEFAULT VALUE; NULL supplied TO NOT NULL COLUMN 'Role' at ROW 1   |
| Warning | 1263 | COLUMN SET TO DEFAULT VALUE; NULL supplied TO NOT NULL COLUMN 'Actor' at ROW 1  |
| Warning | 1263 | COLUMN SET TO DEFAULT VALUE; NULL supplied TO NOT NULL COLUMN 'Part' at ROW 1   |
| Warning | 1263 | COLUMN SET TO DEFAULT VALUE; NULL supplied TO NOT NULL COLUMN 'Film' at ROW 1   |
+---------+------+---------------------------------------------------------------------------------+

When you define a table with the AUTO_INCREMENT set to a value other than 1, the LOAD XML LOCAL FILE command resets the sequence to 1 before loading. I’d recommend you import into a table without an auto incrementing column and then use the REPLACE INTO to set the surrogate key values of auto incremented columns.

Written by maclochlainn

September 26th, 2010 at 1:56 am

Posted in MySQL,xml

9 Responses to 'Load XML Local Infile'

Subscribe to comments with RSS or TrackBack to 'Load XML Local Infile'.

  1. Thanks for posting. Very helpful. Another thing to mention is that if there’s an XML tag that is not also present in the database, the tag and it’s value will be ignored with no warnings or errors.

    JayJay

    21 Oct 11 at 11:08 am

  2. Is there a way to load multiple records instead of just loading last record. Appreciate your help.

    Ravi

    14 Nov 12 at 9:43 pm

  3. Yes, you simply import using the root node for all entries. Here’s an example:

    <?xml version=1.0"?>
    <row>
      <role>Indiana Jones</role>
      <actor>Harrison Ford</actor>
      <part>protagonist</part>
      <film>Indiana Jones and Raiders of the Lost Ark</film>
    </row>
    <row>
      <role>Indiana Jones</role>
      <actor>Harrison Ford</actor>
      <part>protagonist</part>
      <film>Indiana Jones and the Temple of Doom</film>
    </row>
    <row>
      <role>Indiana Jones</role>
      <actor>Harrison Ford</actor>
      <part>protagonist</part>
      <film>Indiana Jones and the Last Crusade</film>
    </row>

    maclochlainn

    14 Nov 12 at 11:57 pm

  4. Sorry, I did not follow the solution. Let me explain my problem more in detail.

    I am using MySQL 5.6 for loading XML file into database table. I found “LOAD XML” is very useful for this purpose. However, I am facing a problem in loading data of child tag. Below is sample XML file.

    Code:

    LOAD XML LOCAL INFILE 'tmp.xml'
    REPLACE INTO TABLE test_1
    ROWS IDENTIFIED BY '';

    As you can see titlearts, wideposters, posters have image tag. If i use load xml and specify tag, It will load all 4 records. But i want to load only data from posters tag. If i use tag , It is only loading last record. But i want to load both 2 image url’s with in posters tag. How can i do this? Appreciate your help on this.

    Ravi

    15 Nov 12 at 9:59 am

  5. Somehow XML i provided is not getting shown on the site. Tried two times. If possible, please provide me your email id. I will send you details by email. Appreciate your help.

    Ravi

    15 Nov 12 at 10:02 am

  6. Ravi,

    Try this, which comes from my Oracle Database 11g and MySQL 5.6 Developer Handbook. Let me know if you have more help through a comment or use my contact form.

    LOAD XML INFILE '/tmp/tmp.xml'
    INTO TABLE test_table
    ROWS IDENTIFIED BY '<row>';

    The XML <row> tag becomes the delimiter. Hope that helps.

    Michael

    maclochlainn

    16 Nov 12 at 1:01 am

  7. I actually review all comments and approve them. Sometimes, life commands my time for my day job, family, et cetera. Good luck and let me know how it works out.

    maclochlainn

    16 Nov 12 at 1:02 am

  8. Thanks Michael. I got it working using a stored procedure with help of extractValue function by looping through multiple images under posters tag.

    Ravi

    16 Nov 12 at 6:00 pm

Leave a Reply