MongoDB GridFS is a specification for storing large files, in particular files that exceed the 16 MB BSON document size limit. All official MongoDB drivers support GridFS. GridFS splits a file into chunks and stores each chunk as a separate document.
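To make the chunking idea concrete, here is a minimal sketch of the arithmetic involved. It does not use the MongoDB driver; the class and method names are hypothetical, and the chunk size of 262144 bytes is simply the value that appears in the shell output later in this post.

```java
final class ChunkCountSketch {

    /** Number of chunk documents needed for a file of the given length. */
    static long chunkCount(long length, long chunkSize) {
        if (length < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("length must be >= 0 and chunkSize > 0");
        }
        return (length + chunkSize - 1) / chunkSize; // ceiling division
    }

    /** Size in bytes of the final chunk, which may be shorter than chunkSize. */
    static long lastChunkSize(long length, long chunkSize) {
        if (length == 0) {
            return 0;
        }
        long remainder = length % chunkSize;
        return remainder == 0 ? chunkSize : remainder;
    }

    public static void main(String[] args) {
        long chunkSize = 262144;                                 // assumed chunk size
        System.out.println(chunkCount(600000, chunkSize));       // prints 3
        System.out.println(lastChunkSize(600000, chunkSize));    // prints 75712
    }
}
```

With the same arithmetic, a 42495-byte file fits in a single chunk.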
GridFS stores documents in two collections.
- Chunks collection (fs.chunks by default).
- Files collection (fs.files by default).
The chunks collection stores the binary chunks, and the files collection stores the metadata of the binary file.
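The relationship between the two collections can be sketched with plain Java maps. This is only an in-memory illustration, not the driver's implementation: the class name, the string ids, and the tiny chunk size are assumptions made for the example, while the field names (_id, filename, length, chunkSize, files_id, n, data) follow the shell output shown later in this post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GridFsLayoutSketch {

    /** Builds a stand-in for the metadata document kept in the files collection. */
    static Map<String, Object> fileDocument(String id, String filename, byte[] data, int chunkSize) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", id);
        doc.put("filename", filename);
        doc.put("length", (long) data.length);
        doc.put("chunkSize", (long) chunkSize);
        return doc;
    }

    /** Builds stand-ins for the documents kept in the chunks collection. */
    static List<Map<String, Object>> chunkDocuments(String fileId, byte[] data, int chunkSize) {
        List<Map<String, Object>> docs = new ArrayList<>();
        int n = 0;
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int end = Math.min(data.length, offset + chunkSize);
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put("files_id", fileId); // points back at the files document
            doc.put("n", n++);           // position of this chunk within the file
            doc.put("data", Arrays.copyOfRange(data, offset, end));
            docs.add(doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        byte[] data = new byte[10];      // hypothetical 10-byte file
        System.out.println(fileDocument("file-1", "test.pdf", data, 4));
        for (Map<String, Object> chunk : chunkDocuments("file-1", data, 4)) {
            System.out.println(chunk.get("files_id") + " n=" + chunk.get("n"));
        }
    }
}
```

Every chunk document carries the id of its files document, which is how the two collections are tied together.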
Steps to save a binary file, such as an image or a PDF, into MongoDB:
- Create a connection to MongoDB.
- Get the database from the connection.
- Create a GridFS object.
- Create a GridFSInputFile and save it.
Code snippet for saving a binary file into MongoDB:
    private void saveFile(final DB db, final File file) throws IOException {
        final GridFS gridFs = new GridFS(db);
        final GridFSInputFile gridFSInputFile = gridFs.createFile(file);
        gridFSInputFile.setFilename(file.getName());
        gridFSInputFile.save();
    }
The code above can be written as a single statement:
    private void saveFile(final DB db, final File file) throws IOException {
        new GridFS(db).createFile(new FileInputStream(file), file.getName()).save();
    }
Both snippets above write the file into the default bucket, called 'fs'.
We can also write the binary file into a specific bucket. To do that, pass the bucket name as the second argument when creating the GridFS object.
Write a binary file into a specific bucket:
    private void saveFile(final DB db, final String bucketName, final File file)
            throws FileNotFoundException {
        new GridFS(db, bucketName).createFile(new FileInputStream(file), file.getName()).save();
    }
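A bucket name acts as a prefix for the two GridFS collections. The following sketch only illustrates that naming rule; the class and method names are hypothetical and nothing here talks to MongoDB.

```java
final class BucketNamesSketch {

    /** Name of the collection holding file metadata for the given bucket. */
    static String filesCollection(String bucket) {
        return bucket + ".files";
    }

    /** Name of the collection holding the binary chunks for the given bucket. */
    static String chunksCollection(String bucket) {
        return bucket + ".chunks";
    }

    public static void main(String[] args) {
        System.out.println(filesCollection("fs"));            // prints fs.files
        System.out.println(chunksCollection("fs"));           // prints fs.chunks
        System.out.println(filesCollection("pdfFiles"));      // prints pdfFiles.files
        System.out.println(chunksCollection("pdfFiles"));     // prints pdfFiles.chunks
    }
}
```

So a file saved into the pdfFiles bucket can be inspected in the mongo shell with db.pdfFiles.files.find() rather than db.fs.files.find().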
Complete Sample Code
    package com.ourownjava.mongo;

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.IOException;
    import java.net.UnknownHostException;

    import com.mongodb.DB;
    import com.mongodb.Mongo;
    import com.mongodb.gridfs.GridFS;
    import com.mongodb.gridfs.GridFSInputFile;

    /**
     * @author ourownjava.com
     */
    public class MongoDBSaveBinaryFile {

        private static final String HOST = "localhost";
        private static final int PORT = 27017;
        private static final String DB_NAME = "reports";
        private static final String BUCKET_NAME = "pdfFiles";

        /**
         * Create a mongodb connection
         *
         * @return Mongo Connection
         * @throws UnknownHostException
         */
        public Mongo getConnection() throws UnknownHostException {
            return new Mongo(HOST, PORT);
        }

        /**
         * save a binary file into mongodb
         * @throws IOException
         */
        private void saveFile(final DB db, final File file) throws IOException {
            final GridFS gridFs = new GridFS(db);
            final GridFSInputFile gridFSInputFile = gridFs.createFile(file);
            gridFSInputFile.setFilename(file.getName());
            gridFSInputFile.save();
        }

        /**
         * save a binary file into a specific bucket in mongodb
         * @throws FileNotFoundException
         */
        private void saveFile(final DB db, final String bucketName, final File file)
                throws FileNotFoundException {
            new GridFS(db, bucketName).createFile(new FileInputStream(file), file.getName()).save();
        }

        /**
         *
         * @param args
         * @throws IOException
         */
        public static void main(String[] args) throws IOException {
            final MongoDBSaveBinaryFile dbSaveBinaryFile = new MongoDBSaveBinaryFile();
            // get a mongodb connection
            final Mongo mongoConnection = dbSaveBinaryFile.getConnection();
            // get the database from the connection
            final DB database = mongoConnection.getDB(DB_NAME);
            // save the file into the default bucket called fs
            final File binaryFile = new File("/home/thosan/java/testfiles/test.pdf");
            dbSaveBinaryFile.saveFile(database, binaryFile);
            // save the binary file into the pdfFiles bucket
            dbSaveBinaryFile.saveFile(database, BUCKET_NAME, binaryFile);
        }
    }
You can verify that the file was saved using the mongo or mongofiles command-line tools.
List files using the mongofiles utility:
    ./mongofiles --port 27017 -d reports list
    connected to: 127.0.0.1:27017
    test.pdf 42495
The mongofiles utility connected to the MongoDB instance running on localhost and listed the files stored in the default fs bucket of the reports database.
Query the files and chunks collections using the mongo shell:
    > use reports
    switched to db reports
    > db.fs.files.find().pretty()
    {
        "_id" : ObjectId("51e0f48444ae8ad8d2202306"),
        "chunkSize" : NumberLong(262144),
        "length" : NumberLong(42495),
        "md5" : "9cb636fa9e46f09992690731d6d17424",
        "filename" : "test.pdf",
        "contentType" : null,
        "uploadDate" : ISODate("2013-07-13T06:32:36.124Z"),
        "aliases" : null
    }
    > db.fs.chunks.find().pretty()
    {
        "_id" : ObjectId("51e0f48444ae8ad8d2202307"),
        "files_id" : ObjectId("51e0f48444ae8ad8d2202306"),
        "n" : 0,
        "data" : BinData(0,"JVBERi0xLjEKJeLjz....."
    }
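The md5 field in the files document gives one more way to verify the save: compute the MD5 digest of the local file and compare it with the stored value. Below is a minimal sketch using only the JDK. The class and method names are hypothetical, and it assumes the stored value is the lowercase hex MD5 of the complete file content.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5CheckSketch {

    /** Returns the lowercase hex MD5 digest of the given bytes. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Compares the digest of local bytes with the md5 value read from the files document. */
    static boolean matches(byte[] localBytes, String storedMd5) {
        return md5Hex(localBytes).equalsIgnoreCase(storedMd5);
    }

    public static void main(String[] args) {
        byte[] sample = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(md5Hex(sample)); // prints 900150983cd24fb0d6963f7d28e17f72
    }
}
```

For a real check, the bytes could come from java.nio.file.Files.readAllBytes on the original file, and the expected value from the md5 field printed by the shell.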
How can you read the same binary file in chunks?
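As a starting point for that question, reading comes down to fetching the chunk documents of one file, ordering them by n, and concatenating their data. The sketch below shows only that reassembly step on in-memory stand-ins. The Chunk class and all names are hypothetical, and no driver calls are involved.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ChunkReadSketch {

    /** In-memory stand-in for one document from the chunks collection. */
    static final class Chunk {
        final int n;
        final byte[] data;

        Chunk(int n, byte[] data) {
            this.n = n;
            this.data = data;
        }
    }

    /** Orders chunks by n and concatenates their data; fails if a chunk is missing. */
    static byte[] reassemble(List<Chunk> chunks) {
        List<Chunk> ordered = new ArrayList<>(chunks);
        ordered.sort(Comparator.comparingInt((Chunk c) -> c.n));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < ordered.size(); i++) {
            Chunk chunk = ordered.get(i);
            if (chunk.n != i) {
                throw new IllegalStateException("missing or duplicate chunk at position " + i);
            }
            out.write(chunk.data, 0, chunk.data.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        List<Chunk> chunks = new ArrayList<>();
        chunks.add(new Chunk(1, "lo".getBytes()));   // chunks may arrive out of order
        chunks.add(new Chunk(0, "hel".getBytes()));
        System.out.println(new String(reassemble(chunks))); // prints hello
    }
}
```

Processing one chunk at a time, instead of buffering the whole file as this sketch does, keeps memory use bounded for large files.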