RethinkDB ships with utilities for importing and exporting data. These serve two purposes: database backup and restore, and the import of new data.
Importing new data is probably the more interesting challenge, since you have to get your import process to produce something RethinkDB can consume.
If you haven’t done this before, note that the import tool requires the RethinkDB Python driver:
apt-get install -y python-pip
pip install rethinkdb
Once you do this, you’ll need to define a database in RethinkDB. My example is a series of JSON exports from the Watson API, so I’ve called the database “Watson”.
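If you’d rather create the database up front than through the web UI, a minimal sketch with the Python driver looks like this (assuming a local server on the default port; older driver versions use the import rethinkdb as r style shown here):

import rethinkdb as r

# Connect to a local RethinkDB instance on the default driver port
conn = r.connect('localhost', 28015)

# Create the database the imports below will target
r.db_create('Watson').run(conn)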
This will let you import a single JSON file:
rethinkdb import -f \
./watson/transcript_s_TextGetEmotion_1608.json \
--table Watson.transcript_s_TextGetEmotion
When you run this, it creates the table automatically. It seems to treat the whole file as a single row (possibly because mine contains one JSON object). If you import the same file again, you will need to pass --force, because the tool won’t otherwise write into an existing table. With --force, the new data goes in as new rows.
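To sanity-check what landed, you can count the documents from the Python driver. This assumes the database and table names above and the conn from the earlier sketch:

# How many rows did the import produce?
print(r.db('Watson').table('transcript_s_TextGetEmotion').count().run(conn))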
In my case I have a folder containing all the JSON files, named after the originating API call followed by a numeric ID.
Thus, to import an entire folder, I can do this:
cd watson
for f in *
do
  # Strip the trailing numeric ID and ".json" extension to get the table name
  table=$(echo "$f" | sed "s/\(.*\)_[0-9]\+\.json/\1/")
  # RethinkDB table names can't contain hyphens, so swap them for underscores
  table=$(echo "$table" | sed "s/-/_/g")
  rethinkdb import -f "$f" --table "Watson.$table" --force
done
cd ..
Note that you can’t use “-” in a RethinkDB table name, so you’ll want to replace those with underscores if you have them in your source file names.
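For completeness, the same folder import can be done directly with the Python driver instead of shelling out to rethinkdb import. This is a rough sketch under the same assumptions as the earlier snippets (local server, a “Watson” database, one JSON object per file); the filename parsing mirrors the sed expressions above:

import json
import os
import re
import rethinkdb as r

conn = r.connect('localhost', 28015)

for fname in os.listdir('./watson'):
    # Strip the trailing numeric ID and extension to get the table name
    m = re.match(r'(.*)_[0-9]+\.json$', fname)
    if not m:
        continue
    # Replace hyphens, which RethinkDB table names don't allow
    table = m.group(1).replace('-', '_')
    # Create the table on first encounter
    if table not in r.db('Watson').table_list().run(conn):
        r.db('Watson').table_create(table).run(conn)
    with open(os.path.join('./watson', fname)) as fh:
        doc = json.load(fh)
    # Insert the file's single JSON object as one row
    r.db('Watson').table(table).insert(doc).run(conn)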