Programming Tips - Algorithm: how to receive big multipart/form-data attachments

Date: 2020jun26
Platform: web

Q. Algorithm: how to receive big multipart/form-data attachments

A. When a browser uploads a file it's sent as multipart/form-data. A boundary string is defined in the main Content-Type header and is then used to separate the parts, including the binary files. The size of each part is not explicitly stated anywhere.

https://www.google.com/search?q=multipart/form-data+example

This means that as you are reading a large file (possibly gigabytes) you need to continually check for the boundary string. That is quite difficult to do efficiently. It would also be nice to avoid keeping the big file in memory.

My solution, when only one file is being uploaded and it is the last field in the form, is to parse the upload body until the binary file begins, then write the rest (using buffered writes) into a file on storage. This means only a few KB of your large file is in memory at a time.

So far so good. But what about the boundary, which has now been written to the end of the file? If the file is a common type it might not actually matter. For example, a PNG reader will just ignore the trailing bytes. But, of course, that's untidy.

Because the boundary is only going to be in the last 100 bytes (or so) of the file, I read the file backwards from the end and truncate just before the boundary. Pretty fast. Don't forget to prepend "--" to the start of the boundary.

In pseudocode:
for (position = lastByte;; position--) {
    if (currentChar == '\n') {
        if (followingCharactersAreBoundary) {
            truncateHere
            Done!
        }
    }
}

In Java:
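import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Returns true if the bytes at the current file position match bBoundary.
static boolean isBoundary(RandomAccessFile raf, final byte[] bBoundary) throws IOException {
    try {
        for (int i = 0; i < bBoundary.length; i++) {
            final byte b = raf.readByte();
            if (b != bBoundary[i]) return false;
        }
    } catch (EOFException ex) {
        // Ran off the end of the file before the whole boundary matched
        return false;
    }
    return true;
}

static void truncateAtBoundary(final String filename, final String strBoundaryIn) throws IOException {
    final String strBoundary = "--" + strBoundaryIn;
    // Use an explicit charset rather than the platform default
    final byte[] bBoundary = strBoundary.getBytes(StandardCharsets.US_ASCII);
    // try-with-resources closes the file even if an exception is thrown
    try (RandomAccessFile raf = new RandomAccessFile(filename, "rw")) {
        final long origLen = raf.length();
        // Read backward to a newline
        for (long pos = origLen - 1; pos >= 0; pos--) {
            raf.seek(pos);
            byte b = raf.readByte();
            if (b == '\n') {
                // There is always a \n before the boundary
                if (isBoundary(raf, bBoundary)) {
                    // Check for a \r before the \n (if present we want to truncate it too).
                    // Guard against pos == 0, where seeking to pos - 1 would fail.
                    if (pos > 0) {
                        raf.seek(pos - 1);
                        b = raf.readByte();
                        if (b == '\r') {
                            pos--;
                        }
                    }
                    raf.setLength(pos);
                    break;
                }
            }
        }
    }
}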
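
The other half of the approach, writing the rest of the body to storage with only a few KB in memory, is not shown above. A minimal sketch of it follows. It assumes the part headers have already been parsed, so the stream is positioned at the first byte of the file data. The class name UploadSink, the method name streamToFile and the 8 KB buffer size are made up for illustration and are not from the original tip.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original tip.
final class UploadSink {

    // Copies everything remaining in 'body' to 'target' and returns the byte count.
    // Assumption: 'body' is already positioned at the first byte of the file part,
    // so the output will still end with the closing boundary line.
    static long streamToFile(InputStream body, Path target) throws IOException {
        final byte[] buf = new byte[8192]; // only this much of the upload is held in memory
        long total = 0;
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(target))) {
            int n;
            while ((n = body.read(buf)) != -1) {
                out.write(buf, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```

Once the copy finishes, the file on storage still ends with the closing boundary line. That is the point at which the backward truncation described above would be run on it.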