I'm implementing persistent large constant arrays via mmap. Is there any tips and tricks or gotchas one should be aware when using mmap?
mmaping large files(for persistent large arrays)
Make sure you check for restrictions on open file size or memory usage. On Linux there is a built in shell command ulimit. Run as
ulimit -a to see the current settings.
Flush writes to the in-memory array to the file with the msync(2) syscall or else they may stay in memory until munmap(2) and there may be a power outage or something before then!
If multiple processes are mmap'ing the same memory region shared with read and write privileges, make sure that only one is writing to it at a time to avoid corrupting your data. Or use file locking or some other means of synchronization.
This is the most straight forward use case for mmap() so there shouldn't be much to trip you up.
You are effectively just loading a large constant array. Being constants you shouldn't need to worry about synchronization. It would be advisable to make sure the prot parameter is set to PROT_READ only since you won't be writing.
If one or more programs using the constants are going to be continually run, it might be worthwhile to have a separate program that loads the data and keeps it resident. Runs of the other programs then essentially are just doing an shared memory attach rather than continually reading the file into memory.
All pointers that are stored inside the mmap'd region should be done as offsets from the base of the mmap'd region, not as real pointers! You won't necessarily be getting the same base address when you mmap the region on the next run of the program. (I have had to clean up code that made incorrect assumptions about mmap region base address constancy).