Postgres Reading Files
I’m working on items for migrating my database class from Oracle to PostgreSQL. I ran into an interesting limitation when I tried using the COPY
command to read an external CSV file.
I had prepared the system by creating a new directory hierarchy owned by the postgres user on top of a /u01/app
mount point. I set the ownership of the directories and files with the following command from the /u01/app
mount point:
chown -R postgres:postgres postgres |
After running the following command:
COPY transaction_upload FROM '/u01/app/upload/postgres/transaction_upload_postgres.csv' DELIMITERS ',' CSV; |
The command raised the following error:
COPY transaction_upload FROM '/u01/app/upload/postgres/transaction_upload_postgres.csv' DELIMITERS ',' CSV; ERROR: must be superuser or a member of the <code>pg_read_server_files</code> role to COPY from a file HINT: Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone. |
The two options for fixing the problem are: Changing the student
user to a superuser, and granting the pg_read_server_files
role to the student
user. Changing the student
user to a superuser isn’t really a practical option. So, I connected as the postgres
superuser and granted the pg_read_server_files
role to the student
user. It is a system level role and therefore doesn’t limit the role to only the videodb
database.
As the postgres
user, type the following command to grant the pg_read_server_files
role to the system
user:
GRANT pg_read_server_files TO student; |
After granting the role to the student
user, I created a small test case. The test table definition is:
CREATE TABLE test ( id INTEGER , first_name VARCHAR(20) , last_name VARCHAR(20)); |
I created a test.csv
file in the /u01/app/upload/postgres
directory, like:
1,Simon,Bolivar 2,Peter,Davenport 3,Michael,Swan |
The test.csv
file requires the following permissions and ownerships:
-rw-r--r--. 1 postgres postgres 49 Nov 13 10:56 test.csv |
The permissions are user read-write, groups read, and others read. The ownership should be granted to postgres
and the primary group for the postgres
user, which should also be postgres
.
You can then connect to psql
as the student
user with the database set to videodb
and run the following copy
command:
COPY test FROM '/u01/app/upload/postgres/test.csv' DELIMITERS ',' CSV; |
If you put a comma at the end of each line, like you would do in MySQL, it raises an error. The trailing comma raises the following error:
ERROR: extra data after last expected column |
If you forget a delimiting commas somewhere on a line, the copy command raises the following error:
ERROR: missing data for column "last_name" CONTEXT: COPY tester, line 3: "3,Michael Swan" |
The error
points to the column after the missing column. The context
points to the line number while displaying the text.
You should take careful note that the copy
command is an appending command. If you run it a second time, you insert a duplicate set of values in the target table.
After experimenting, its time to fix my student instance. The transaction_upload_mysql.csv file has two critical errors that need to be fixed. They are:
- A comma terminates each line, which would raise an extra data after last expected column error.
- A comma terminates each line followed by some indefinite amount of whitespace, which would also raise an extra data after last expected column error.
Since I have students with little expertise in Unix or Linux commands, I must provide a single command that they can use to convert the file with problems to one without problems. However, they should copy the transaction_upload_mysql.csv
file to ensure they don’t disable the equivalent functionality for the MySQL solution space.
They should copy two files as the root
user from the mysql
directory to the postgres
directory, as follows:
cp /u01/app/mysql/upload/transaction_upload_mysql.csv /u01/app/postgres/upload/transaction_upload_postgres.csv cp /u01/app/mysql/upload/transaction_upload2_mysql.csv /u01/app/postgres/upload/transaction_upload2_postgres.csv |
As the root user in the /u01/app/upload/postgres directory, run the following command:
cat transaction_upload_postgres.csv | sed -e 's/\,$//g' > x; cat x | sed -e 's/\,[[:space:]]*$//g' > y; mv y transaction_upload_postgres.csv; rm x |
Please check the file permissions and ownerships with the ll
(long list) command. If the file isn’t like this:
-rw-r--r--. 1 postgres postgres 49 Nov 13 10:56 transaction_upload_postgres.csv |
Then, they should be able to change it as the root user with these commands:
chown postgres:postgres transaction_upload_postgres.csv chmod 544 transaction_upload_postgres.csv |
Lastly, they should connect to the psql
as the student
user, using the videodb
database and run the following command:
COPY transaction_upload FROM '/u01/app/postgres/upload/transaction_upload_postgres.csv' DELIMITERS ',' CSV; |
A query of the import table with this:
SELECT COUNT(*) FROM transaction_upload; |
should return:
count ------- 11520 (1 row) |
As always, I hope this helps those looking for some explanation and example on the copy
feature of PostgreSQL.