Clojure: why does this writer consume so much heap space?


I have a 700 MB XML file that I convert from a tree of records into an EDN file.

After all the processing, I end up with a lazy sequence of hash maps that are not particularly big (at most 10 values each).

To finish, I want to write it to a file with:

(defn write-catalog [catalog-edn]
  (with-open [wrtr (io/writer "catalog-fr.edn")]
    (doseq [x catalog-edn]
      (.write wrtr (prn-str x)))))

I do not understand the problem, because doseq is not supposed to retain the head of the sequence in memory.

My final output catalog is of type clojure.lang.LazySeq.

I then do

(write-catalog catalog)

Memory usage then climbs steadily, and I get a GC overhead error after about 80 MB of the file has been written, with -Xmx set to 3g.

I also tried doseq with spit and without prn-str; the same thing happens.

Is this normal behaviour?



The memory leak is possibly due to the realized catalog values being kept alive (search for "head retention"). As write-catalog realizes the items one by one, they stay in memory because something still references the head of the sequence (presumably you def catalog somewhere). To fix this, try not to keep the catalog in a var and instead pass it straight to write-catalog. For example, if you parse it from somewhere (which I guess is the case, considering your previous question), you would do:

(write-catalog (transform-catalog (get-catalog "mycatalog.xml")))

so that huge intermediate sequences won't eat all your memory.
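To make the difference concrete, here is a minimal sketch of both variants. It is not your actual pipeline: make-catalog is a hypothetical stand-in for your parsing and transformation steps, and write-catalog takes an extra output argument (my addition) so the sketch can write to a temporary file.

```clojure
(require '[clojure.java.io :as io])

;; Hypothetical stand-in for the parsed catalog: a lazy seq of small maps.
(defn make-catalog [n]
  (map (fn [i] {:id i :name (str "item-" i)}) (range n)))

;; Same body as in the question, with the output target passed in
;; (an assumption made for this sketch).
(defn write-catalog [out catalog-edn]
  (with-open [wrtr (io/writer out)]
    (doseq [x catalog-edn]
      (.write wrtr (prn-str x)))))

;; Variant that retains the head: the var `catalog` points at the first
;; element, so every item realized by doseq stays reachable.
;; (def catalog (make-catalog 10000000))
;; (write-catalog "catalog-fr.edn" catalog)

;; Variant without a var: the lazy seq is handed straight to the function,
;; so nothing outside write-catalog references its head.
(def out-file (java.io.File/createTempFile "catalog-sketch" ".edn"))
(write-catalog out-file (make-catalog 1000))
```

If you still need the name catalog for convenience at the REPL, one option is to define it as a function that builds the sequence each time, rather than as a var that holds the sequence itself.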

Hope it helps.
